Add media file reading in filesystem server #2382
Draft
+120
−47
SEE ADDITIONAL CONTEXT SECTION AT BOTTOM FOR INFO ON SDK ERROR ON LARGE FILES. IN DRAFT WHILE INVESTIGATING.
Summary of Changes
This pull request enhances the file system interaction capabilities by clarifying the purpose of the existing text file reading tool through a rename, and by introducing a dedicated tool for handling binary media files. This allows the system to process and return image and audio data in a base64 format, significantly expanding the types of files that can be directly accessed and utilized.
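As a rough illustration of the flow described above, the sketch below streams a file into a base64 string, picks a MIME type from the file extension, and wraps both in a content item. Only the name readFileAsBase64Stream comes from this PR's changelog. The extension table, the helpers mimeTypeFor, contentTypeFor and readMediaFile, the application/octet-stream fallback, and the "blob" fallback type are assumptions made for illustration and may differ from the actual implementation.

```typescript
import { createReadStream } from "node:fs";
import { extname } from "node:path";

// Assumed extension table; the real tool may support a different set.
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
  ".mp3": "audio/mpeg",
  ".wav": "audio/wav",
  ".ogg": "audio/ogg",
};

// Hypothetical helper: look up the MIME type by case-insensitive extension.
function mimeTypeFor(filePath: string): string {
  return MIME_TYPES[extname(filePath).toLowerCase()] ?? "application/octet-stream";
}

// Hypothetical helper: choose the content item type from the MIME type.
function contentTypeFor(mimeType: string): "image" | "audio" | "blob" {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType.startsWith("audio/")) return "audio";
  return "blob";
}

// Read the file in chunks through a stream, then base64 encode the joined bytes.
function readFileAsBase64Stream(filePath: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const chunks: Buffer[] = [];
    const stream = createReadStream(filePath);
    stream.on("data", (chunk) => chunks.push(chunk as Buffer));
    stream.on("end", () => resolve(Buffer.concat(chunks).toString("base64")));
    stream.on("error", reject);
  });
}

// Hypothetical wrapper: assemble a tool result holding one media content item.
async function readMediaFile(filePath: string) {
  const mimeType = mimeTypeFor(filePath);
  const data = await readFileAsBase64Stream(filePath);
  return { content: [{ type: contentTypeFor(mimeType), data, mimeType }] };
}
```

Collecting chunks and encoding once at the end avoids the padding problems that come from base64 encoding each chunk separately, at the cost of holding the whole file in memory.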
Highlights
- The read_file tool has been renamed to read_text_file to more accurately reflect its function of reading text-based file content.
- A new tool, read_media_file, has been introduced. This tool is designed to read image and audio files and return their content as base64-encoded data, along with the appropriate MIME type.
- The @modelcontextprotocol/sdk dependency has been updated to version 1.16.0, which also brings in a new transitive dependency, eventsource-parser.
- The filesystem README has been updated to document the renamed read_text_file tool and provide details for the newly added read_media_file tool.
Changelog
- Updated @modelcontextprotocol/sdk from 1.12.3 to 1.16.0.
- Added eventsource-parser as a new transitive dependency.
- Renamed read_file to read_text_file.
- Updated the description of read_text_file to specify text-only reading and mention the head/tail parameters.
- Documented the read_media_file tool, detailing its purpose and input parameters.
- Imported createReadStream for streaming file content (line 13).
- Renamed ReadFileArgsSchema to ReadTextFileArgsSchema (line 120).
- Added ReadMediaFileArgsSchema for the new media tool (line 126).
- Added the readFileAsBase64Stream utility function to efficiently read and base64 encode file content (lines 482-495).
- Updated the ListToolsRequestSchema handler to rename read_file to read_text_file and include the new read_media_file tool definition (lines 502-519).
- Updated the CallToolRequestSchema handler to process read_text_file calls under its new name and added a new case for read_media_file to handle media file reading, MIME type detection, and base64 encoding (lines 631-694).
- Updated @modelcontextprotocol/sdk from 1.12.3 to 1.16.0.
Server Details
Motivation and Context
Currently, the server-filesystem MCP server has a read_file tool, which has optional head and tail params and always treats the file as UTF-8 text.
Issue #533 contains a user report:
This PR adds support for reading media files (audio and image) with a new read_media_file tool, and the old read_file tool is renamed to read_text_file for clarity.
How Has This Been Tested?
Using the Inspector UI:
Using the Inspector CLI
Breaking Changes
The tool name read_file is now read_text_file and its description is updated to reflect that it only works with text files. This should not present an issue to clients, since it will make the decision of when to use the tool clearer.
Types of changes
Checklist
Additional context
Error when reading large files
A "Maximum call stack size exceeded" error occurs when reading des.png, a file that is 4 MB in size:
Calling the tool for a large file with Inspector UI
Calling the tool for a large file with Inspector CLI
Process of elimination:
It isn't the function
It isn't the server
It isn't the Inspector
Inspector UI
Inspector CLI
Found culprit in SDK
The code in Protocol.ts that runs when the response is received parses the result against the resultSchema.
After replacing result = resultSchema.parse(response.result); with const result = response.result; the error no longer occurs with the large file:

I explored lots of ways to make the compiled schema less massive using z.lazy() and z.discriminatedUnion(), but in the end that didn't fix it.
The actual culprit turned out to be Zod's .base64() validation.
Under the hood, z.string().base64() uses a regular expression to validate the string. While this regex is fine for typical inputs, running it against a string that is several megabytes long can cause the JavaScript engine's regex matcher to hit its internal recursion limit, resulting in a "Maximum call stack size exceeded" error.
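The failure mode described here can be probed with a small script. The pattern below is an assumption: it mirrors the general shape of a padded base64 regex (a repeated capture group of four characters) and is not copied from Zod. Whether the large input actually throws depends on the JavaScript engine and its stack size, so no particular output is claimed for it.

```typescript
// Assumed pattern: groups of four base64 characters, optionally followed by
// a padded final group. Not taken from Zod's source.
const BASE64_REGEX =
  /^([0-9a-zA-Z+\/]{4})*(([0-9a-zA-Z+\/]{2}==)|([0-9a-zA-Z+\/]{3}=))?$/;

type ProbeResult = { ok: true; valid: boolean } | { ok: false; error: string };

// Run the regex and report either the match result or the thrown error.
function probeBase64Regex(input: string): ProbeResult {
  try {
    return { ok: true, valid: BASE64_REGEX.test(input) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Probe a string of about 4 million valid base64 characters.
const largeInput = "A".repeat(4 * 1024 * 1024);
const largeProbe = probeBase64Regex(largeInput);
console.log(largeProbe.ok ? `validated: ${largeProbe.valid}` : `threw: ${largeProbe.error}`);
```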
Problem solved
What was needed was a more robust base64 checker in the SDK. I added one, which fixed the problem, and created a PR with the fix.
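One way such a checker could look is a single pass over the string with no regular expression, so the work is a flat loop regardless of input size. This is a minimal sketch under stated assumptions, not the code from the SDK PR: the function name isBase64 is hypothetical, and it accepts only the standard alphabet with required padding (and, like the regex approach, treats the empty string as valid).

```typescript
// Hypothetical loop-based validator for standard, padded base64.
function isBase64(value: string): boolean {
  const len = value.length;
  // Padded base64 always has a length that is a multiple of four.
  if (len % 4 !== 0) return false;

  // Strip up to two trailing '=' padding characters (char code 61).
  let end = len;
  if (len > 0 && value.charCodeAt(len - 1) === 61) {
    end--;
    if (value.charCodeAt(len - 2) === 61) end--;
  }

  // Every remaining character must be A-Z, a-z, 0-9, '+' or '/'.
  for (let i = 0; i < end; i++) {
    const c = value.charCodeAt(i);
    const inAlphabet =
      (c >= 65 && c <= 90) ||  // A-Z
      (c >= 97 && c <= 122) || // a-z
      (c >= 48 && c <= 57) ||  // 0-9
      c === 43 ||              // +
      c === 47;                // /
    if (!inAlphabet) return false;
  }
  return true;
}
```

Because the loop keeps no per-character state on the call stack, a multi-megabyte payload is checked in the same way as a short one.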
I tested the large file that was causing the "Maximum call stack size exceeded" error in both the Inspector CLI and UI. The error no longer occurs and the 4 MB file can be processed.
Inspector CLI
Inspector UI